Spectral Tripartitioning of Networks 
We formulate a spectral graph-partitioning algorithm that uses the two leading eigenvectors of
the matrix corresponding to a selected quality function to split a network into three communities 
in a single step. In so doing, we extend the recursive bipartitioning methods developed by Newman 
[Proc. Nat. Acad. Sci. 103, 8577 (2006); Phys. Rev. E 74, 036104 (2006)] to allow one to consider 
the best available two-way and three-way divisions at each recursive step. We illustrate the method 
using simple "bucket brigade" examples and then apply the algorithm to examine the community 
structures of the coauthorship graph of network scientists and of U. S. Congressional networks 
inferred from roll-call voting similarities. 
PACS numbers: 89.75.Fb, 05.10.-a, 89.65.Ef, 89.75.Hc



I. INTRODUCTION 



Networks, or "graphs," provide a powerful represen- 
tation for the analysis of complex systems of interact- 
ing entities. This framework has opened a large array 
of analytical and computational tools, and the study of 
networks has accordingly become pervasive in sociology, 
biology, information science, and many other disciplines 
0, 0, [H, H, [||. The simplest type of network — an un- 
weighted, undirected, unipartite graph — consists of a col- 
lection of nodes (representing the entities) that are con- 
nected by edges (representing the ties/links). Important 
generalizations include weighted edges (ties with differ- 
ent strengths), directed edges (links from one node to 
another without reciprocation), and signed edges (e.g., 
ties interpreted as good or bad). Many networks in ap- 
plications are also bipartite, with two types of nodes and 
ties that always connect nodes of one type to those of the 
other 0]. 

To better understand the structural and functional 
organization of networks, it is useful to develop com- 
putational techniques to detect cohesive sets of nodes 
called "communities," which can be identified as groups 
of nodes that have stronger internal ties than they have 
to external nodes @, i, 0, 1, ©, E3, EH- The larger 
density of intra-community edges versus inter-community 
edges, relative to what one might expect at random, has 
been shown in many cases to correspond to increased 
similarity or association among nodes in the same com- 
munity. For example, communities in social networks 
might correspond to circles of friends, and communities 
in the World Wide Web might correspond to pages on
closely-related topics. Over the last seven years, the
detection of communities has become a particularly ac- 
tive and important area of network science 

@, 0,11 EH- 

Community-detection efforts have yielded several striking
successes, offering insights into college football ranking
systems [j| EH , committee [HI, EH EH and cosponsorship
[16] collaborations in the United States Congress,
functional motifs in biological networks [TtI EH) social 



structures in cellular-phone conversation networks IS], 
social organization in collegiate friendship networks
and more. 

Available algorithms to detect communities take 
a variety of forms, including linkage clustering 21], 
betweenness- based methods (H, [22J, local techniques 
[HHi, HE HI], and spectral partitioning [13, HI]. Some 
of these approaches can be cast as computational heuris- 
tics for the optimization of quality functions, such as the 
global quantity of "modularity" (and variants thereof) 
[23 . [30 , [3U [32J or local quantities that more intimately 
measure the roles of links both between individual nodes 
and between/ within individual communities @, Q. Most 
community-detection algorithms can be classified into 
one of three categories: recursive partitioning, local ag- 
glomeration, or direct calculation into a final number of 
communities. The last approach tends to be computa- 
tionally expensive, whereas the first two can misappropri- 
ate nodes and typically require some heuristic choices in 
the development of the algorithm. In many community- 
detection methods, the constructed communities can also 
be layered in a hierarchical fashion, though the result- 
ing hierarchy might depend strongly on the algorithm 
employed rather than just on the hierarchy of communi- 
ties in the actual network. Some methods also allow one 
to study overlapping communities [H, [H, [25|, [26|, [33l ]. 
though in the present discussion we consider only parti- 
tioning into nonoverlapping communities. 

Given the expanse and constant, rapid advances in the 
area of community detection, we do not endeavor to more
fully catalog or compare the numerous methods that are 
now available in the literature. Such discussions are now 
available in several review articles @, 0, E3, El ■ Start- 
ing from the recognition that using spectral partitioning 
to optimize modularity and other similar quality func- 
tions is one of the (many) available and preferred means 
for identifying communities, we focus on that family of 
methods for the remainder of this paper. We present 
a fundamental extension to this class of methods and 
demonstrate the resulting improvement with some ex- 
amples. 

In traditional spectral partitioning, which arose most
prominently in the development of algorithms for parallel
computation, one relates network properties to the
spectrum of the graph Laplacian matrix [34, 35]. In the
simplest such procedure, one starts by partitioning a network
into two subnetworks of specified size. One then
examines the resulting subnetworks and further divides
them if desired, continuing this procedure recursively for
as many divisive steps as desired. After community detection
became prominent in network science, spectral
methods were generalized to include some algorithms for
community detection, including work with steps
that go beyond two-way splits [36, 37]. In a recent pair of
papers [27, 28], Newman reformulated the idea of maximizing
modularity as a spectral partitioning problem by
constructing a modularity matrix and using its leading
eigenvector to spectrally partition networks into two subnetworks.
He then applied recursive subdivision until no
further divisions improved modularity, resulting in a final
collection of communities whose number and sizes
need not be specified in advance. This is an important
feature for the study of communities in just about every
social, biological, and information network application,
as there is often no way to know the numbers and sizes
of communities in advance.

Given the NP-complete nature of modularity optimization
[38], the polynomial-time spectral algorithm does
not guarantee a global optimum. Indeed, Ref. [28] includes
a simple example of eight vertices connected together
in a line, in which the best partition found by
recursive bipartition consists of 2 groups of 4 nodes each,
whereas the exhaustively-enumerated optimum partition
consists of 3 communities. The latter partition cannot be
obtained by this recursive bipartition method because the
initial split occurs in the middle of the line. Reference [28]
also explores some possibilities for using multiple leading
eigenvectors of the modularity matrix but does not pursue
the idea in detail beyond a two-eigenvector method
for bipartitioning.

In the present paper, we provide a valuable extension 
of the spectral partitioning methods for community de- 
tection in which we use two eigenvectors to tripartition a 
network (or subnetwork) into three groups in each step. 
This method can be combined with the one- and two- 
eigenvector bipartitioning methods to more thoroughly 
explore promising partitions for computational optimization
of the selected quality function (either modularity
or other choices) with only limited additional computational
cost. In developing this tripartitioning extension,
we employ a modified Kernighan-Lin (KL) algorithm
[39] (see also Refs. [27, 28]; other modifications of
KL are also possible [H, |43||). We illustrate the resulting 
spectral method for community detection with the same 
nodes-in-a-line "bucket brigade" networks that are not 
always optimally partitioned by recursive bipartitioning. 
As examples, we then apply the method to similarity 
networks constructed from U. S. Congressional roll call 
votes [13, 5]1 l44l and to the graph of network scientist 
coauthorships [28]. In so doing, we include the important
consideration that quality functions other than the
usual modularity measure can be similarly used in such 
spectral partitioning (without otherwise altering the al- 
gorithm in any way) provided that they can be cast in a 
similar matrix form. 

The rest of this paper is organized as follows. In Section II,
we review the existing formulation for spectral
community detection. In particular, we discuss how to
recursively bipartition a network using either the leading
eigenvector or the leading pair of eigenvectors of the modularity
matrix. In Section III, we present a theory (and
polynomial-time algorithm) that extends these ideas to
three-way subdivision (tripartitioning) using the leading
eigenvector pair. In Section IV, we provide a faster implementation
of this procedure in which we employ a
restricted consideration of the possible cases and then
leverage KL iterations to identify high-quality partitions.
We subsequently present several examples in Section V,
highlighting situations in which allowing one to choose
either two-way or three-way splits at each step in the recursion
procedure results in higher-modularity partitions
than recursive bipartitioning alone. Finally, we summarize
our results in Section VI.



II. REVIEW OF SPECTRAL PARTITIONING 
USING MODULARITY 

In this study, we largely focus on the quality function
known as modularity [29, 30, 31], which we attempt
to maximize for a given undirected network via spectral
partitioning. We stress that our methods can be used
with any quality function that can be cast in a similar
matrix form. (We illustrate one such example in Section V.)
Because the novel algorithms we present both
extend and interface with existing spectral partitioning
methods, it is necessary to review the essential elements
of Refs. [27, 28] that recur in the subsequent presentation
of our tripartitioning scheme.

Starting from the definition of a network in terms of
its nodes and (possibly weighted) edges, we denote the
strengths of connections using a symmetric adjacency
matrix $\mathbf{A}$, whose components $A_{ij} = A_{ji}$ codify the presence
and strength of connection between nodes $i$ and $j$.
In an unweighted network, each $A_{ij}$ component has a
binary {0, 1} value. Given some partition of the network,
the modularity $Q$ can be used to compare the total
weight of intra-community connections relative to the
weight that one would expect on average in some specified
null model:

$$Q = \frac{1}{2m} \sum_{g=1}^{c} \sum_{i,j \in G_g} B_{ij} = \frac{1}{2m} \mathrm{Tr}(\mathbf{S}^T \mathbf{B} \mathbf{S}), \qquad (1)$$

where $\mathbf{B}$ is called the modularity matrix, $m = \frac{1}{2} \sum_{ij} A_{ij}$
is the total edge weight in the network, $B_{ij} = A_{ij} - P_{ij}$
for a selected null model matrix $\mathbf{P}$ with elements $P_{ij}$,
the sum over $g$ runs over the $c$ communities in the specified
partition, and the set of vertices $G_g$ comprises the
$g$th community. The $n$-by-$c$ "community-assignment matrix"
$\mathbf{S}$ encodes the non-overlapping assignment ("hard
partitioning" [7]) of each of the $n$ nodes to the $c$ communities:

$$S_{ig} = \begin{cases} 1, & \text{if node } i \text{ belongs to community } g, \\ 0, & \text{otherwise.} \end{cases} \qquad (2)$$

The most commonly studied null model for unipartite
networks recovers the Newman-Girvan definition of
modularity [30], which is obtained by considering the ensemble
of random graphs with independent edges conditional
on having the same expected strength distribution
as the original network. This gives $P_{ij} = k_i k_j/(2m)$,
where $k_i = \sum_j A_{ij}$ is the total edge weight ("strength")
of the $i$th node (equivalent to its degree in the unweighted
case). One then constructs the modularity matrix $\mathbf{B}$ by
subtracting these expected connection strengths from the
connection weights in $\mathbf{A}$.
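As a concrete illustration of the modularity definition with the Newman-Girvan null model, the following minimal sketch evaluates $Q$ for two partitions of the 8-node line discussed in the Introduction. The function names `path_graph` and `modularity` are hypothetical helpers introduced here for illustration; they are not taken from the paper or from any library.

```python
def path_graph(n):
    """Adjacency matrix of n nodes connected in a line (a "bucket brigade")."""
    adj = [[0] * n for _ in range(n)]
    for i in range(n - 1):
        adj[i][i + 1] = adj[i + 1][i] = 1
    return adj


def modularity(adj, labels):
    """Q = (1/2m) * sum of (A_ij - k_i k_j / 2m) over same-community pairs."""
    n = len(adj)
    strength = [sum(row) for row in adj]   # k_i
    two_m = sum(strength)                  # 2m = sum_ij A_ij
    total = 0.0
    for i in range(n):
        for j in range(n):
            if labels[i] == labels[j]:
                total += adj[i][j] - strength[i] * strength[j] / two_m
    return total / two_m


adj = path_graph(8)
q_two = modularity(adj, [0, 0, 0, 0, 1, 1, 1, 1])    # two groups of four
q_three = modularity(adj, [0, 0, 0, 1, 1, 2, 2, 2])  # groups of 3, 2, and 3
print(round(q_two, 4), round(q_three, 4))  # 0.3571 0.3776
```

The three-community partition scores higher than the even bisection, which is the gap that recursive bipartitioning alone cannot close for this network.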

Importantly, modularity is not the only quality function
that can be cast in the form of (1). In cases that are
seemingly closer to uniform random graphs, such as those
encountered in the study of network tie strengths inferred
from similarities (as occurs, for example, when studying
voting patterns in roll calls [44]), a uniform $P_{ij} = p$ null
model might be an appropriate alternative. Moreover,
modularity does not always provide a suitable resolution
of a network's community structure. One means of overcoming
this deficiency, which gives the ability to examine
a network at different resolution levels, is to multiply
the null model by a resolution parameter $\gamma$, yielding
$P_{ij} = \gamma k_i k_j/(2m)$ [32, 45]. (This is related to the sizes of
communities obtained using random walk processes over
different time intervals [46].) Another means of introducing
a resolution parameter is by addition of (possibly
signed) self-loops along the diagonal of a modified adjacency
matrix and carrying the resulting changes into the
usual modularity null model [47]. One can also consider
directed networks with appropriate null models by using
the symmetric part of $\mathbf{B}$ [48].

In the present work, we do not restrict ourselves by
making any assumptions about the null model beyond its
representation in terms of a $\mathbf{B}$ matrix in (1). However,
the application to large, sparse networks requires that
this matrix have sufficient structure to enable efficient
computation of the product $\mathbf{B}\mathbf{v}$ for arbitrary vectors $\mathbf{v}$.
All of the quality functions mentioned above have this
property. Indeed, we will include consideration of the
additional self-loops model in the Congressional roll call
example.
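For the Newman-Girvan null model, the product can be formed without ever storing the dense matrix, because $(\mathbf{B}\mathbf{v})_i = (\mathbf{A}\mathbf{v})_i - k_i (\mathbf{k} \cdot \mathbf{v})/(2m)$. The sketch below assumes a hypothetical adjacency-list input format (`neighbors[i]` is a list of `(j, weight)` pairs); other null models would need their own structured product.

```python
def modularity_matvec(neighbors, v):
    """Return B v for B_ij = A_ij - k_i k_j / (2m) without building B."""
    strength = [sum(w for _, w in nbrs) for nbrs in neighbors]   # k_i
    two_m = sum(strength)
    k_dot_v = sum(k * x for k, x in zip(strength, v))
    out = []
    for i, nbrs in enumerate(neighbors):
        a_v = sum(w * v[j] for j, w in nbrs)                     # (A v)_i
        out.append(a_v - strength[i] * k_dot_v / two_m)
    return out


# 4-node line stored as weighted adjacency lists: neighbors[i] = [(j, A_ij), ...]
neighbors = [[(1, 1.0)], [(0, 1.0), (2, 1.0)], [(1, 1.0), (3, 1.0)], [(2, 1.0)]]
ones_image = modularity_matvec(neighbors, [1.0, 1.0, 1.0, 1.0])
unit_image = modularity_matvec(neighbors, [1.0, 0.0, 0.0, 0.0])
```

The cost per product is proportional to the number of edges plus the number of nodes. The all-ones vector is mapped to zero here because the rows of this particular modularity matrix sum to zero.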

We now review the bipartitioning of nodes into two
communities (not necessarily of equal size) using the leading
eigenvector of $\mathbf{B}$. In calculating $Q$ (or other relevant
quality function), one can replace the role of the $n$-by-2
community assignment matrix $\mathbf{S}$ by a single community
vector $\mathbf{s}$ with components $s_i = \pm 1$ indicating the assignment
of node $i$ to one of two groups. Because $\frac{1}{2}(s_i s_j + 1)$
equivalently indicates whether nodes $i$ and $j$ have been
placed in the same community, it follows that one can
attempt to optimize $Q_s = \frac{1}{4m} \mathbf{s}^T \mathbf{B} \mathbf{s}$. If the null model
maintains $\sum_{ij} B_{ij} = 0$ identically, then $Q_s = Q$. For
other null models, the difference between $Q$ and $Q_s$ is
a constant specified by the adjacency matrix and null
model, so optimization of $Q_s$ is equivalent to optimization
of $Q$. The leading eigenvector $\mathbf{u}_1$ (associated
with the largest positive eigenvalue) of $\mathbf{B}$ gives
the maximum possible value of $\mathbf{v}^T \mathbf{B} \mathbf{v}$ for real-valued $\mathbf{v}$
of fixed norm. Heuristically, one thus expects the community
assignment indicated by $s_j = \mathrm{sgn}([\mathbf{u}_1]_j)$ to give a large value
of modularity. (When $[\mathbf{u}_1]_j = 0$, one can set $s_j = \pm 1$
according to whichever choice gives larger modularity. If
KL iterations are going to be used later to improve community
assignments, it is sufficient to use a simpler rule
of thumb.)
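The leading-eigenvector bipartition can be sketched in a few lines. This is an illustration only: it uses a plain shifted power iteration as a stand-in for a proper eigensolver (an assumption made to keep the example dependency-free), and it presumes the largest eigenvalue is simple and the starting vector is not orthogonal to its eigenvector. The function names are hypothetical.

```python
import math


def modularity_matrix(adj):
    """B_ij = A_ij - k_i k_j / (2m) for a symmetric adjacency matrix."""
    n = len(adj)
    strength = [sum(row) for row in adj]
    two_m = sum(strength)
    return [[adj[i][j] - strength[i] * strength[j] / two_m for j in range(n)]
            for i in range(n)]


def leading_eigenvector(b, iterations=2000):
    """Power iteration on B + shift*I (a stand-in for a library eigensolver)."""
    n = len(b)
    # Gershgorin bound: adding this shift makes every eigenvalue nonnegative,
    # so the iteration is dominated by the largest eigenvalue of B.
    shift = max(sum(abs(x) for x in row) for row in b)
    v = [float(i + 1) for i in range(n)]   # arbitrary deterministic start
    for _ in range(iterations):
        w = [sum(b[i][j] * v[j] for j in range(n)) + shift * v[i]
             for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


def spectral_bipartition(adj):
    """s_i = sgn([u_1]_i); zero components are sent to +1 as a rule of thumb."""
    u1 = leading_eigenvector(modularity_matrix(adj))
    return [1 if x >= 0 else -1 for x in u1]


# 8-node line ("bucket brigade"): nodes i and j are linked when |i - j| = 1.
adj = [[1 if abs(i - j) == 1 else 0 for j in range(8)] for i in range(8)]
labels = spectral_bipartition(adj)
```

On the 8-node line, the signs of the leading eigenvector split the nodes in the middle, giving two groups of four. The overall sign of an eigenvector is arbitrary, so only the grouping is meaningful, not which group is labeled +1.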

Similar ideas can be used to bipartition using multiple
leading eigenvectors [28]. Without specifying all
of the details here, the $p$-eigenvector approach starts by
the selection of $n$ node vectors $\mathbf{r}_i$ whose $j$th components
($j \in \{1, \ldots, p\}$) are determined by

$$[\mathbf{r}_i]_j = \sqrt{\beta_j - \alpha}\, U_{ij},$$

where $\beta_j$ is the eigenvalue of $\mathbf{u}_j$ (i.e., the $j$th ordered
eigenvector of $\mathbf{B}$), $\mathbf{U} = (\mathbf{u}_1 | \mathbf{u}_2 | \cdots)$, and the constant
$\alpha \le \beta_p$ is related to the approximation for $Q$ obtained
by using only the first $p$ vectors (proceeding using as
many positive $\beta_j$ as desired). For computational convenience,
we hereafter set $\alpha = \beta_n$. The modularity is then
approximated by the relation

$$Q \simeq \tilde{Q} = n\alpha + \sum_{g=1}^{c} |\mathbf{R}_g|^2, \qquad (3)$$

where the sum is over the number of communities ($c$)
and the contribution of each community is given by the
squared magnitude of the associated community vector
$\mathbf{R}_g = \sum_{i \in G_g} \mathbf{r}_i$. Important to further developments in Ref. [28]
and by us below is that the assignments of nodes to
communities that maximize the sum in (3) require that
$\mathbf{R}_g \cdot \mathbf{r}_i > 0$ if node $i$ has been assigned to community $g$.
By contrast, if $\mathbf{R}_g \cdot \mathbf{r}_i < 0$, then simply reassigning node $i$
from community $g$ to its own individual group increases
the modularity approximation (3) by

$$\Delta Q = |\mathbf{R}_g - \mathbf{r}_i|^2 + |\mathbf{r}_i|^2 - |\mathbf{R}_g|^2 = 2|\mathbf{r}_i|^2 - 2\mathbf{R}_g \cdot \mathbf{r}_i > 0.$$

Similarly, all pairs of communities $\{g, h\}$ in the partition
that optimizes (3) must be at least 90° apart (i.e.,
$\mathbf{R}_g \cdot \mathbf{R}_h < 0$), because the change in $Q$ from merging two
communities is

$$|\mathbf{R}_g + \mathbf{R}_h|^2 - (|\mathbf{R}_g|^2 + |\mathbf{R}_h|^2) = 2\mathbf{R}_g \cdot \mathbf{R}_h.$$

Because the maximum number of directions more
than 90° apart that can exist simultaneously in a $p$-dimensional
space is $p + 1$, the $p$-dimensional representation
of the vertices and communities restricts the spectral
optimization of (3) to a partitioning (in a single step) into
at most $p + 1$ groups.
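The partition-dependent part of the vector approximation is simply a sum of squared community-vector lengths, which makes the merge identity easy to check numerically. The sketch below uses made-up planar node vectors and hypothetical helper names; it does not compute the node vectors from eigenvectors.

```python
def community_vectors(node_vectors, labels):
    """R_g = sum of the node vectors r_i assigned to community g."""
    dim = len(node_vectors[0])
    groups = {}
    for r, g in zip(node_vectors, labels):
        acc = groups.setdefault(g, [0.0] * dim)
        for d in range(dim):
            acc[d] += r[d]
    return groups


def vector_quality(node_vectors, labels):
    """Sum over communities of |R_g|^2, the partition-dependent part of Q-tilde."""
    return sum(sum(x * x for x in big_r)
               for big_r in community_vectors(node_vectors, labels).values())


# Made-up planar node vectors: two in one group, one in another.
vectors = [(1.0, 0.0), (0.5, 0.5), (-1.0, 0.2)]
split = vector_quality(vectors, [0, 0, 1])
merged = vector_quality(vectors, [0, 0, 0])

groups = community_vectors(vectors, [0, 0, 1])
dot = sum(a * b for a, b in zip(groups[0], groups[1]))   # R_0 . R_1
```

Here `merged - split` equals `2 * dot`, which is negative because the two community vectors are more than 90° apart, so merging them would lower the approximate modularity.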

Leveraging the above geometric constraints, the
bipartition that optimizes (3) must be equivalent to some
bisection of the node vector space by a codimension-one
hyperplane that separates the vertices of the two communities.
Computational use of this observation requires
efficient enumeration of the allowed partitions. For instance,
in the two-eigenvector planar case ($p = 2$, assuming
at least two positive eigenvalues), only $n/2$ distinct
partitions are allowed by the geometric constraints. As
shown in Fig. 1, each permissible partition is specified
by bisecting the plane according to a cut line that passes
through the origin [28]. The algorithm for spectral bipartitioning
by two eigenvectors then proceeds by considering
each of the $n/2$ allowed partitions and selecting
the best available one.
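A minimal sketch of the cut-line search follows, assuming the planar node vectors are already available as `(x, y)` pairs. It sweeps a line through the origin and tries one cut between each pair of angularly adjacent node directions, scoring each candidate by the sum of squared community-vector lengths. The sketch makes no attempt to deduplicate equivalent candidates, so it may evaluate more cuts than the count quoted above. All names and the test vectors are hypothetical.

```python
import math


def quality(node_vectors, labels):
    """Sum over groups of |R_g|^2, with R_g the sum of the group's node vectors."""
    sums = {}
    for (x, y), g in zip(node_vectors, labels):
        sx, sy = sums.get(g, (0.0, 0.0))
        sums[g] = (sx + x, sy + y)
    return sum(sx * sx + sy * sy for sx, sy in sums.values())


def best_cut_line(node_vectors):
    # Direction of each node vector modulo pi: a rotating cut line only changes
    # the partition when it sweeps across one of these directions.
    angles = sorted(math.atan2(y, x) % math.pi for x, y in node_vectors)
    best_labels, best_q = None, -math.inf
    for k, a in enumerate(angles):
        nxt = angles[k + 1] if k + 1 < len(angles) else angles[0] + math.pi
        phi = 0.5 * (a + nxt)                  # cut line between two directions
        dx, dy = math.cos(phi), math.sin(phi)
        # The sign of the cross product tells which side of the line a node is on.
        labels = [0 if dx * y - dy * x > 0 else 1 for x, y in node_vectors]
        q = quality(node_vectors, labels)
        if q > best_q:
            best_labels, best_q = labels, q
    return best_labels, best_q


# Made-up node vectors: two pointing roughly right, two roughly left.
points = [(1.0, 0.2), (0.9, -0.1), (-1.0, 0.1), (-0.8, -0.3)]
labels, q = best_cut_line(points)
```

For these vectors the best cut is close to vertical and separates the two right-pointing nodes from the two left-pointing ones.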

Recursive bipartitioning can then be used to split a
network into as many communities as desired or until it
can no longer further improve the value of the quality
function. This recursive subdivision must be done with
a generalized modularity matrix in order to properly account
for the contribution to modularity from further
subdivisions of subnetworks [28]. Specifically, the change
in modularity given by subdivision of the $n_G$-node group
$G$ into $c_G$ smaller groups specified by $S_{ig}$ can be recast as
a similar spectral partitioning problem with the $n_G \times n_G$
generalized modularity matrix $\mathbf{B}^{(G)}$ taking the place of
$\mathbf{B}$. Its elements, indexed by the node labels $i, j \in G$, are
specified in terms of $\mathbf{B}$ by

$$B^{(G)}_{ij} = B_{ij} - \delta_{ij} \sum_{l \in G} B_{il}. \qquad (4)$$

One can then implement such subdivisions recursively
until modularity can no longer be increased with any
additional partitioning.
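The generalized modularity matrix is the restriction of $\mathbf{B}$ to the group, with each row's within-group sum subtracted from the diagonal. A minimal sketch, using a dense list-of-lists matrix and hypothetical function names:

```python
def modularity_matrix(adj):
    """B_ij = A_ij - k_i k_j / (2m) for a symmetric adjacency matrix."""
    n = len(adj)
    strength = [sum(row) for row in adj]
    two_m = sum(strength)
    return [[adj[i][j] - strength[i] * strength[j] / two_m for j in range(n)]
            for i in range(n)]


def generalized_modularity_matrix(b, group):
    """B^(G)_ij = B_ij - delta_ij * sum_{l in G} B_il for nodes i, j in G."""
    sub = [[b[i][j] for j in group] for i in group]
    for a, row in enumerate(sub):
        row[a] -= sum(row)   # subtract the within-group row sum from the diagonal
    return sub


# 8-node line; subdivide the group made of the first four nodes.
adj = [[1 if abs(i - j) == 1 else 0 for j in range(8)] for i in range(8)]
b = modularity_matrix(adj)
b_group = generalized_modularity_matrix(b, [0, 1, 2, 3])
```

Only the diagonal changes, and every row of `b_group` sums to zero by construction.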

As an aside, we remark that spectral bipartitioning
using any null model for which the rows of $\mathbf{B}$ (or its
symmetric part) sum to zero has the additional property
that $\mathbf{R}_1 = -\mathbf{R}_2$ because the zero row sums guarantee
by eigenvector orthogonality that $\sum_g \mathbf{R}_g = \sum_i \mathbf{r}_i = 0$.
Meanwhile, the generalized modularity matrix $\mathbf{B}^{(G)}$ for
recursive subdivision has zero row sums by construction.

FIG. 1: (Color online) Bisection in two dimensions (i.e., using
the leading pair of eigenvectors of B) for a small network.
Solid (blue) lines with dots represent node vectors. The dotted
(red) line represents a selected cut line in the plane. All
nodes on one side of the line are assigned to one community
and all nodes on the other side are assigned to the other community.
Rotating this line about the origin yields the set of
possible planar bipartitions. (This figure is adopted from one 
in Ref. 0].) 



The absence of this property for the initial division for
more general null models does not affect the implementation
of either the algorithms described above or
the recursive tripartitioning algorithm that we present
below.

Finally, we reiterate that individual partitioning steps
and the recursive implementation of such steps can be
employed as a computational heuristic for optimizing any
quality function (not just modularity) that can be written
in a matrix form similar to (1). Naturally, this approach
is not the only available heuristic for optimizing a specified
quality function (1) or its vector approximation (3)
(see, e.g., the many references mentioned earlier). For
instance, Ref. uses (3) as the basis of an eigenvector-ordered
vector-partitioning algorithm that uses bisection
in each coordinate as the starting point for collecting
nodes into the geometrically-constrained number of
groups (which, we recall, is one more than the dimension
of the ambient space). In this sense, their algorithm
has some similarities in two dimensions to the one we
describe below. However, we take a different approach:
We first establish in Section III the relevant inequalities
and geometry of the problem of tripartitioning (which
requires generalizing the constraints we reviewed above)
and then propose a divide-and-conquer implementation
strategy in Section IV. We subsequently apply our techniques
to an illustrative example and real-world networks
in Section V.







III. EIGENVECTOR-PAIR TRIPARTITIONING: 
THEORY 







Recursive network bipartitioning, combined with subsequent
KL iterations (which we describe and use in Section IV),
rapidly produces high-quality partitions. However,
given the algorithms' polynomial running times,
such partitions are not guaranteed to be optimal. Indeed,
Ref. [28] describes a simple case of an 8-node line segment
(bucket brigade) network in which these algorithms miss
the optimal-modularity partition. The recursive bipartitioning
procedure initially bisects the network into two
groups (see Fig. 2a) and then terminates because no further
subdivision improves modularity. In contrast, the
modularity-maximizing partition of this small test network
(which can be obtained by exhaustive enumeration)
consists of three communities (see Fig. 2b) that cannot be
obtained by subsequent subdivision of the initial bisecting
split. Figure 2 also shows a similar 20-node bucket
brigade network that we will discuss in more detail in
Section V.

Motivated by the above example, the efficient two-vector
bipartitioning algorithm, and the geometric constraints
that limit a $p$-dimensional node vector space representation
to at most $p + 1$ communities (as described
in Section II), we consider whether a similarly-efficient
mechanism exists for dividing the plane of node vectors
into three groups in a single partitioning step. We start
by considering the generalization of the cut line (illustrated
in Fig. 1) to a set of three non-overlapping wedges
that fill the plane (as shown, e.g., in Fig. 3). That is,
instead of a single cut line that intersects the origin and
bisects the plane, we ask whether the planar tripartitions
that are geometrically permitted by (3) are equivalent to
finding three rays emanating from the origin, with each
(non-overlapping) wedge between these rays specifying
the vertices of a group. That is, vertices are assigned to
a community if they lie between the rays denoting that
community's boundaries.

FIG. 2: (Color online) The 8-node bucket brigade network
discussed in Ref. [28] and a similar 20-node bucket brigade
network. Solid lines between nodes represent connections.
Panel (a) shows the best initial bipartition (solid vertical line)
of the 8-node network (no further subdivision increases modularity).
Panel (b) shows the modularity-maximizing partition
(dotted lines) of the same network. Note that the three-community
partition in (b) cannot be obtained directly from
(a) via subsequent partitioning. Panel (c) shows the maximum-modularity
partition of the 20-node network into four
communities (indicated by ovals around nodes and also by
colors online), compared with the partition obtained via the
best initial three-way division (vertical dotted lines), which
yields a larger modularity than the initial bipartition (vertical
solid line). In this case, the four-way partition cannot be
obtained from recursive partitioning of the initial three-way
division, but it is obtained from recursive partitioning of the
bipartition.

As we now show, such non-overlapping wedges [52] describe
the geometric constraint that vertices whose vectors
are located inside one wedge cannot be assigned to a
community associated with another wedge in the partition
that maximizes (3) in the plane. Consider two planar
community vectors, $\mathbf{R}_1$ and $\mathbf{R}_2$ (with $\mathbf{R}_1 \cdot \mathbf{R}_2 < 0$, as discussed
in Section II). Also suppose that there is a node
vector $\mathbf{r}_0$ within 90° of each community vector, so that
$\mathbf{r}_0 \cdot \mathbf{R}_g > 0$ for $g \in \{1, 2\}$. Without loss of generality, we
introduce a second node vector $\mathbf{r}_1$ in the (smaller-angle)
region between $\mathbf{r}_0$ and $\mathbf{R}_2$. The proof of non-overlapping
wedges for the optimization of (3) then reduces to showing
that if $\mathbf{r}_1$ is assigned to the $\mathbf{R}_1$ group, then $\mathbf{r}_0$ must
be assigned to $\mathbf{R}_1$ as well. That is, in the event that $\mathbf{r}_0$
has been assigned to $\mathbf{R}_2$, the $\Delta Q_{01}$ improvement in moving
node 0 to group 1 should be positive. Similarly, the
condition that $\mathbf{r}_1$ is (correctly) assigned to the $\mathbf{R}_1$ group
requires both that $\mathbf{r}_1 \cdot \mathbf{R}_1 > 0$ and that $\Delta Q_{12} < 0$ for
moving node 1 to group 2, where

$$\Delta Q_{12} = |\mathbf{R}_1 - \mathbf{r}_1|^2 + |\mathbf{R}_2 + \mathbf{r}_1|^2 - |\mathbf{R}_1|^2 - |\mathbf{R}_2|^2 = 2|\mathbf{r}_1| \{ |\mathbf{r}_1| - |\mathbf{R}_1| \cos\theta_{11} + |\mathbf{R}_2| \cos\theta_{12} \}$$

and $\cos\theta_{vg} = \mathbf{r}_v \cdot \mathbf{R}_g / (|\mathbf{r}_v| |\mathbf{R}_g|)$. From the geometric
ordering of the vectors specified above, it follows that
$0 < \cos\theta_{11} < \cos\theta_{01} < 1$ and $0 < \cos\theta_{02} < \cos\theta_{12} < 1$.
The improvement in moving node 0 to group 1 (that
is, moving to a partition assignment by non-overlapping
wedges) then becomes

$$\begin{aligned}
\Delta Q_{01} &= |\mathbf{R}_1 + \mathbf{r}_0|^2 + |\mathbf{R}_2 - \mathbf{r}_0|^2 - |\mathbf{R}_1|^2 - |\mathbf{R}_2|^2 \\
&= 2|\mathbf{r}_0| \{ |\mathbf{r}_0| + |\mathbf{R}_1| \cos\theta_{01} - |\mathbf{R}_2| \cos\theta_{02} \} \\
&> 2|\mathbf{r}_0| \{ |\mathbf{r}_0| + |\mathbf{R}_1| \cos\theta_{11} - |\mathbf{R}_2| \cos\theta_{12} \} \\
&= 2|\mathbf{r}_0| \left\{ |\mathbf{r}_0| + |\mathbf{r}_1| - \frac{1}{2|\mathbf{r}_1|} \Delta Q_{12} \right\} > 0.
\end{aligned}$$

Therefore, the optimal $Q$ value in the plane must result
from the assignment of nodes to groups equivalent to the
specification of non-overlapping wedges.

FIG. 3: (Color online) Tripartitioning with vertex vectors
from the leading pair of eigenvectors of B for a small network.
Solid (blue) lines with dots represent node vectors, which have
been rescaled according to the observed standard deviation of
each component. The dotted (red) rays border a selected set
of wedges in the plane, where each wedge indicates the set of
nodes assigned to one group. One obtains the tripartitions
allowed by the planar geometric constraints by rotating the
boundaries of these wedges about the origin.

Recall from Section II that the bisection of the plane
into two halves, indicated by selecting a cut line, yields
only $n/2$ distinct cases from which to select the best bipartition.
However, the corresponding enumeration for
three-way division, in which one selects three rays that
border wedges, leaves $O(n^3)$ distinct partitions in the
plane that must be enumerated and evaluated. While
this provides an improvement over the original non-polynomial
complexity of the partitioning problem, one
still needs a more efficient heuristic for large networks
to effectively employ such tripartitions (and subsequent
subdivisions) at computational cost comparable with
spectral bipartitioning. We present such a heuristic in
Section IV.
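The cubic enumeration can be sketched directly: sort the nodes by the angle of their planar vectors, then place three rays in the gaps of that circular order, so that each group is a contiguous arc. The sketch below is an illustrative brute-force search under stated assumptions. It takes the node vectors as given `(x, y)` pairs, scores candidates by the sum of squared community-vector lengths, and does not filter out wedge choices that violate the additional constraints on the group vectors. The names and test data are hypothetical.

```python
import math
from itertools import combinations


def quality(node_vectors, labels):
    """Sum over groups of |R_g|^2, with R_g the sum of the group's node vectors."""
    sums = {}
    for (x, y), g in zip(node_vectors, labels):
        sx, sy = sums.get(g, (0.0, 0.0))
        sums[g] = (sx + x, sy + y)
    return sum(sx * sx + sy * sy for sx, sy in sums.values())


def best_wedge_tripartition(node_vectors):
    n = len(node_vectors)
    # Circular order of the nodes by the angle of their vectors.
    order = sorted(range(n),
                   key=lambda i: math.atan2(node_vectors[i][1], node_vectors[i][0]))
    best_labels, best_q = None, -math.inf
    # Rays placed just after positions a < b < c cut the circular order into
    # three contiguous arcs; the arc past c wraps around to the start.
    for a, b, c in combinations(range(n), 3):
        labels = [0] * n
        for pos, i in enumerate(order):
            if a < pos <= b:
                labels[i] = 1
            elif b < pos <= c:
                labels[i] = 2
        q = quality(node_vectors, labels)
        if q > best_q:
            best_labels, best_q = labels, q
    return best_labels, best_q


# Made-up test data: three tight pairs of unit vectors, 120 degrees apart.
degrees = [-10, 10, 110, 130, 230, 250]
points = [(math.cos(math.radians(d)), math.sin(math.radians(d))) for d in degrees]
labels, q = best_wedge_tripartition(points)
```

The loop visits one candidate per choice of three gaps among the $n$ positions, which is the source of the cubic growth and the reason a coarser search is attractive for large networks.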

IV. EIGENVECTOR-PAIR TRIPARTITIONING: 
FAST IMPLEMENTATION 

We accelerate the process of planar tripartitioning using
a divide-and-conquer approach that reduces the number
of considered configurations and, hence, the computational
cost. This approach yields a method that is
computationally competitive with two-eigenvector bipartitioning
even for large networks. In our implementation,
we start at a coarse level of considering the four available
tripartitions that can be obtained by unions of the quadrants
in the plane. Before dividing these regions further,
we rescale the individual coordinates of the plane according
to the standard deviations of the observed components
along each coordinate, keeping their original values
for use in equation (3). This ad hoc rescaling makes
the spatial distribution of vertex vectors more uniform
and improves the efficiency of not only this coarse assignment
but also subsequent refinements. We then refine
the quadrants by bisecting them into $w = 8$ wedge
regions and individually consider each of the permissible
unions of these new regions. If the best tripartition from
$w = 8$ unions is better than that from quadrant unions,
we bisect further to obtain $w = 16$ regions (which we
then test), repeating this process of bisecting the pieces
and finding their best union until the partition quality no
longer improves (or no longer improves by some specified
threshold).

Obviously, this divide-and-conquer approach does not
consider all $O(n^3)$ possible planar tripartitions. Indeed,
at the stage in which the plane has been subdivided
into $w = 2^j$ parts, one considers at most
$\binom{w}{3} = \frac{1}{6} w(w - 1)(w - 2)$ neighboring unions of the $w$ regions
(including those already enumerated at smaller $w$ values).
Some sets of nodes might repeat in this construction,
and many such unions do not meet the full geometric
constraints on group vectors, but it is easiest to
code a search over all $\binom{w}{3}$ neighboring unions of the $w$
regions. The quality of the best partition obtained using
such a small subset is of course unlikely to match
the optimum obtained over all $O(n^3)$ allowed planar tripartitions.
However, as we demonstrate below and in
Section V, the resulting method does not appear to suffer
from lower-quality communities when combined with
subsequent KL iterations (which we now describe).
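The coarse-to-fine search can be sketched as follows, under several assumptions made for illustration: node vectors are given as `(x, y)` pairs, the $w$ regions are equal-angle sectors of the rescaled plane, candidates are scored with the original (unrescaled) vectors, and refinement stops when doubling $w$ no longer improves the score by more than a small tolerance. The function names, the `max_w` cap, and the tolerance are hypothetical choices, not values from the paper.

```python
import math
from itertools import combinations
from statistics import pstdev


def quality(node_vectors, labels):
    """Sum over groups of |R_g|^2, with R_g the sum of the group's node vectors."""
    sums = {}
    for (x, y), g in zip(node_vectors, labels):
        sx, sy = sums.get(g, (0.0, 0.0))
        sums[g] = (sx + x, sy + y)
    return sum(sx * sx + sy * sy for sx, sy in sums.values())


def coarse_tripartition(node_vectors, max_w=64, tol=1e-12):
    sx = pstdev([x for x, _ in node_vectors]) or 1.0
    sy = pstdev([y for _, y in node_vectors]) or 1.0
    # Angles use the rescaled coordinates; quality() uses the original ones.
    angles = [math.atan2(y / sy, x / sx) % (2 * math.pi) for x, y in node_vectors]
    best_labels, best_q = None, -math.inf
    w = 4                                     # start from the four quadrants
    while w <= max_w:
        sector = [min(int(a * w / (2 * math.pi)), w - 1) for a in angles]
        level_labels, level_q = None, -math.inf
        # Choose 3 of the w sector boundaries: sectors [a, b) form one group,
        # [b, c) another, and the remaining sectors (wrapping around) the third.
        for a, b, c in combinations(range(w), 3):
            labels = [1 if a <= s < b else 2 if b <= s < c else 0 for s in sector]
            q = quality(node_vectors, labels)
            if q > level_q:
                level_labels, level_q = labels, q
        if level_q <= best_q + tol:           # no improvement: stop refining
            break
        best_labels, best_q = level_labels, level_q
        w *= 2
    return best_labels, best_q


# Made-up test data: three tight pairs of unit vectors, 120 degrees apart.
degrees = [-10, 10, 110, 130, 230, 250]
points = [(math.cos(math.radians(d)), math.sin(math.radians(d))) for d in degrees]
labels, q = coarse_tripartition(points)
```

Each level evaluates $\binom{w}{3}$ candidates (4 for quadrants, 56 for $w = 8$, 560 for $w = 16$), independent of $n$ apart from the cost of scoring a candidate.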

The Kernighan-Lin (KL) iterative improvement
scheme we use is a natural generalization [27] of the original
method [39] to the case where the sizes of communities
are not specified in advance. A single KL iteration
step consists of moving vertices one at a time from
their assigned community to a different community such
that each move provides the largest available increase (or
smallest decrease) in the quality function, subject to the
constraint that each node is moved only once. That is,
as soon as a node is moved, it is removed from the list
of those available for consideration in upcoming moves in
that step. These moves are selected independently of the
geometric constraints placed on groups in the reduced
eigenvector space; instead, the quality function is used
directly to assess the value of each move. After all $n$
nodes have been moved precisely once each, one selects
the best available partition from the $n$ that have been
explored. If that partition is of higher quality than the
initial one, a new KL iteration step is started from this
new best state. Otherwise, the algorithm returns the initial
partition as its final result.
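The KL step described above can be sketched as follows. This is a deliberately naive illustration: it recomputes the quality from scratch for every trial move instead of updating it incrementally, it keeps the number of groups fixed at `n_groups` (groups may become empty), and it scores partitions by the sum of $B_{ij}$ over same-community pairs, which is $2mQ$. The function names and the small test graph are hypothetical.

```python
def modularity_matrix(adj):
    """B_ij = A_ij - k_i k_j / (2m) for a symmetric adjacency matrix."""
    n = len(adj)
    strength = [sum(row) for row in adj]
    two_m = sum(strength)
    return [[adj[i][j] - strength[i] * strength[j] / two_m for j in range(n)]
            for i in range(n)]


def quality(b, labels):
    """Sum of B_ij over same-community pairs (equal to 2m times Q)."""
    n = len(b)
    return sum(b[i][j] for i in range(n) for j in range(n)
               if labels[i] == labels[j])


def kl_pass(b, labels, n_groups):
    """One KL step: move every node exactly once, return the best state seen."""
    current = list(labels)
    free = set(range(len(labels)))
    best_labels, best_q = list(labels), quality(b, labels)
    while free:
        move, move_q = None, -float("inf")
        for i in sorted(free):
            for g in range(n_groups):
                if g == current[i]:
                    continue
                trial = current[:i] + [g] + current[i + 1:]
                q = quality(b, trial)
                if q > move_q:                 # largest increase or smallest decrease
                    move, move_q = (i, g), q
        i, g = move
        current[i] = g
        free.remove(i)                         # each node moves only once per step
        if move_q > best_q:
            best_labels, best_q = list(current), move_q
    return best_labels, best_q


def kl_refine(b, labels, n_groups):
    """Repeat KL steps until a step fails to improve on its starting partition."""
    best_q = quality(b, labels)
    while True:
        new_labels, new_q = kl_pass(b, labels, n_groups)
        if new_q <= best_q + 1e-12:
            return labels, best_q
        labels, best_q = new_labels, new_q


# Two triangles (0-1-2 and 3-4-5) joined by the edge 2-3; node 3 starts misplaced.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
adj = [[0] * 6 for _ in range(6)]
for i, j in edges:
    adj[i][j] = adj[j][i] = 1
labels, q = kl_refine(modularity_matrix(adj), [0, 0, 0, 0, 1, 1], 2)
```

In this example the first move of the first step relocates node 3 to the other triangle, and the second step finds nothing better, so the refinement returns the two-triangle partition.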
Because KL iterations typically improve the best partitions
constructed from the recursive spectral bipartitioning
and tripartitioning algorithms, we recommend using
them whenever possible. The relatively high quality of
the spectral partitions typically manifests itself in rapid
convergence of the KL process. Moreover, the use of KL
iterations is particularly important in light of our divide-and-conquer
approach for reducing the number of planar
tripartitions we consider. Specifically, we do not consider
all of the $O(n^3)$ permissible planar tripartitions.
However, the post-KL results do not appear to suffer in
quality despite the substantially-reduced number of configurations
considered by our divide-and-conquer strategy.
Consequently, the consideration of the full set of
allowed planar wedges appears to be essentially unnecessary
because the improvement obtained from such exceptional
additional effort is overshadowed by the gains
of the subsequent KL iterations.

As an example of the efficiency of this divide-and- 
conquer plus KL approach, we consider the initial tripar- 
tition by modularity of the largest connected component 
of the coauthorship graph of network scientists, which 
has n = 379 nodes and 914 weighted edges [28]. (We will
discuss this example in further detail in Section V.) The
method does not identify a higher modularity among the 
allowed unions of w = 64 regions than that obtained at 
w = 32 (with Q = 0.5928). As a comparison, the best 
of the O(n^3) allowed planar tripartitions has modular-
ity Q = 0.6175, but comes at the cost of a greater than
200-fold increase in the number of configurations that 
must be considered. In contrast, modularity is increased 
much more by using KL iterations after these two dif- 
ferent tripartitions, with Q = 0.6354 starting from the 
Q = 0.5928 partition, and Q = 0.6349 starting from the 
Q = 0.6175 partition. Observe that the higher post-KL 
modularity arises from the lower-modularity initial state, 
possibly due to an increased flexibility to move nodes be- 
tween different groups; applying KL iterations starting 
from the best spectrally-identified tripartition here stays 
stuck near a local maximum. Moreover, the best union at 
w = 32 in this example is only marginally (1.4%) better 
than that obtained at w = 16. One thus might make the 
algorithmic choice to only refine if the improvement in the 
quality function is better than some minimal threshold, 
because each refinement step of doubling w increases the 
computational cost of this divide-and-conquer approach 
by a factor of about 8. In the present work, we typically 
require the best modularity obtained with such a refine- 
ment to be at least 5% better than previous values in 
order to justify the further doubling of w. We also stop 
the w-refinement process when w becomes greater than
the number of nodes in the network. 
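
The stopping rule for the w-refinement can be sketched as follows. This is only an illustration under stated assumptions: refine_regions and the callable best_quality_at (which would return the best quality found among unions of w regions) are hypothetical names, the starting value w_start is arbitrary, the quality is assumed positive so that a relative improvement is meaningful, and the limit on w is read as never evaluating a value of w larger than the number of nodes.

```python
def refine_regions(best_quality_at, n_nodes, w_start=16, min_gain=0.05):
    """Double w while each doubling improves the best quality by >= min_gain.

    best_quality_at(w) is a hypothetical callable returning the best quality
    found among the allowed unions of w regions (assumed positive).
    Returns the (w, quality) pair with the highest quality seen.
    """
    w = w_best = w_start
    q_best = best_quality_at(w)
    while 2 * w <= n_nodes:           # never evaluate w beyond the network size
        w *= 2                        # each doubling costs roughly 8x more
        q = best_quality_at(w)
        gain = (q - q_best) / abs(q_best)
        if q > q_best:
            w_best, q_best = w, q     # keep the better partition either way
        if gain < min_gain:           # improvement too small to justify more
            break
    return w_best, q_best
```

The threshold min_gain = 0.05 corresponds to the 5% criterion mentioned above; a smaller value trades additional computation for a finer search.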

Based on the above discussion, we hypothesize that
the reduced set of partitions considered has broadly
sampled the gross global configuration possibilities suf-
ficiently well so that the post-KL result should still be
close in quality (and, as we have seen in our examples,
even better in some cases) to that obtained after KL
starting from the best of the permissible planar parti-
tions. We are intrigued by such thoughts but concerned
that the complexity and detail of the local extrema of
the selected quality function might not allow any general
rigorous analysis.

We thus proceed with some representative examples 
using the aforementioned collection of spectral partition- 
ing algorithms with subsequent KL iterations. We ob- 
tained the results in Section [V] using a recursive parti- 
tioning code that at each step selects the best partition 
from the available bipartitions and tripartitions (using 
the fast implementations described above), followed by 
KL iterations starting from the point at which no further 
improvement in the quality function can be identified by 
spectral subdivision. We also generalize this partitioning 
algorithm further (as described in Ref. [16]) by perform-
ing additional spectral subdivisions as if each group to
be divided were the full network itself, isolated from the 
other groups in the partition (i.e., by constructing each B 
matrix directly from the adjacency submatrix restricted 
to one subnetwork at a time). Such steps decrease the 
global quality of the resulting partitions, even though 
they increase the quality of partitioning each individual 
subnetwork in isolation. Despite the extra computational 
cost of this subnetwork-restricted partitioning procedure, 
we have found that using KL iterations after such ex- 
tended partitioning sometimes results in a higher global 
quality in the final partition. Unsurprisingly, these KL 
iterations correctly tend to merge a large number of the 
groups obtained from subnetwork-restricted partitioning 
during their search for the highest-quality partition.

Obviously, one can devise many different variants on 
the above ideas, and the efficacy of one's results may vary 
(with better performance for some choices in specific ex- 
amples). Additionally, one can significantly accelerate 
the spectral partitioning steps if the optimization in each 
step is based on the summed magnitude of the group vec-
tors, approximating quality in (3), by efficiently updat-
ing group vectors from one considered configuration to
the next (as opposed to full recalculations). When con- 
sidering the full modularity (or other quality function) 
for different partitions, one can similarly accelerate the 
spectral steps using direct calculations of the differences 
in modularity between the configurations under consid- 
eration. 
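
As one concrete instance of computing differences directly, the change in quality caused by moving a single node can be obtained from one row of B instead of a full recalculation. The sketch below assumes a symmetric B and a quality of the form Q = (1/2W) sum_ij B_ij delta(g_i, g_j); the function names are hypothetical, and the full-recalculation function is included only for comparison.

```python
import numpy as np

def quality(B, labels, two_w):
    """Full recalculation: Q = (1 / 2W) * sum of B_ij over same-group pairs."""
    same = labels[:, None] == labels[None, :]
    return B[same].sum() / two_w

def delta_quality(B, labels, i, new_group, two_w):
    """Change in Q when node i moves to new_group, using only row i of B.

    Assumes B is symmetric. The diagonal term B_ii contributes before and
    after the move, so it cancels.
    """
    old_group = labels[i]
    if new_group == old_group:
        return 0.0
    in_new = labels == new_group
    in_old = labels == old_group
    in_old[i] = False                 # exclude node i itself
    return 2.0 * (B[i, in_new].sum() - B[i, in_old].sum()) / two_w
```

Each evaluation costs O(n) rather than the O(n^2) of a full recalculation, which matters when many candidate moves are compared.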



V. EXAMPLES 

Maintaining our focus on the utility of considering tri- 
partitioning steps in recursive spectral partitioning with 
B eigenvectors, we proceed to consider example networks 
using the two implementations described in Section IV
(with and without subnetwork-restricted partitioning). 
We specifically include examples in which one can im- 
prove community-detection results by allowing triparti- 
tioning steps and subnetwork-restricted modularity max- 
imization. We draw our examples from different areas: 
the test "bucket brigade" networks of nodes connected
to their nearest neighbors along a line segment, politi-
cal networks constructed from voting similarities in U. S.
Congressional roll calls [41], and the coauthorship collab-
oration graph of network scientists [28].



A. Bucket Brigade Networks 

Because the 8-node line segment spurred our inter- 
est in spectral partitioning, we briefly use this net- 
work and its generalizations to further illustrate our re- 
sults. The 8-node bucket brigade is the smallest nearest- 
neighbor line segment whose modularity-maximizing par- 
tition contains more than two communities. (The 7-node 
bucket brigade has two-community and three-community 
partitions with equal modularity.) However, the optimal 
partition (shown in Fig. 2b) cannot be obtained from
recursive bipartitioning, which terminates after the ini-
tial bisection of the network (see Fig. 2a). In contrast,
the tripartitioning method identifies the optimal parti- 
tion in a single step. Subnetwork-restricted partition- 
ing gives another means to identify the optimal partition 
through spectral partitioning with KL iterations. Specif-
ically, applying this procedure to the initial bisection in
Fig. 2a further bisects each group, giving four groups of
two nodes each, from which KL iterations merge two of
these four groups on their way to the optimal three-group
partition.
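
The statements about the 7-node and 8-node bucket brigades can be checked by exact enumeration. The sketch below is illustrative only: the function names are hypothetical, and it restricts attention to partitions whose groups are contiguous along the line segment, which is an assumption of this sketch rather than something stated above. Rational arithmetic is used so that ties are detected exactly.

```python
from fractions import Fraction
from itertools import combinations

def line_modularity(sizes):
    """Exact modularity of a line of nodes cut into consecutive groups."""
    n = sum(sizes)
    two_m = 2 * (n - 1)               # twice the number of edges
    q = Fraction(0)
    start = 0
    for s in sizes:
        internal_edges = s - 1
        # Every node has degree 2 except the two end nodes (degree 1).
        degree_sum = 2 * s - (start == 0) - (start + s == n)
        q += Fraction(2 * internal_edges, two_m) - Fraction(degree_sum, two_m) ** 2
        start += s
    return q

def contiguous_partitions(n):
    """Yield every way of cutting n nodes on a line into consecutive groups."""
    for r in range(n):
        for cuts in combinations(range(1, n), r):
            bounds = (0,) + cuts + (n,)
            yield tuple(b - a for a, b in zip(bounds, bounds[1:]))

def best_partitions(n):
    """Return the maximum modularity and all group-size tuples attaining it."""
    scored = [(line_modularity(p), p) for p in contiguous_partitions(n)]
    q_max = max(q for q, _ in scored)
    return q_max, sorted(p for q, p in scored if q == q_max)
```

Under these assumptions, best_partitions(8) returns the single three-group partition with sizes (3, 2, 3), whereas best_partitions(7) returns both two-group and three-group partitions with equal modularity.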

Yet another mechanism available to reach the optimal 
state is to allow the individual KL moves to place a node 
in a newly-created group of its own. If such moves are 
selected as tie-breakers over other moves that yield equal 
changes in modularity, then the optimal configuration of 
the 8-node bucket brigade can be obtained even in the ab- 
sence of tripartitioning and subnetwork-restricted mod- 
ularity maximization. However, we warn that allowing 
the formation of such new groups as possible KL moves 
can drastically increase the number of groups under con- 
sideration (in some cases significantly beyond that ob- 
tained using subnetwork-restricted partitioning), so the 
increase in computational cost in generalizing KL itera- 
tions in this manner might not be worthwhile in all sit- 
uations. KL iterations can, of course, be generalized in 
other ways, such as by modifying the stopping condition 
or tie-breaking methods. In the results presented here, we
break ties randomly and do not allow new groups to be 
created in KL moves (except when explicitly indicated). 

Contrasting the above, the 20-node bucket brigade, 
with a maximum modularity partition (Q = 0.5914) of 
four communities with five nodes each (see Fig. 3b), pro-
vides a cautionary illustration. Recursive bipartition-
ing by itself initially bisects the bucket brigade into two
groups of 10 (Q = 0.4474) and further subdivides each of 
those groups to correctly identify the optimal partition. 
However, an initial tripartition (shown in Fig. [2k), has 
higher modularity (Q = 0.5609) than any initial biparti-
tion, and recursive spectral steps that take the better re-
sult from bipartitioning and tripartitioning terminate with
a Q = 0.5720 partition of sizes n_g = {4,3,6,3,4}, from
which KL iterations yield a five-group n_g = {4,4,4,4,4}
partition (Q = 0.5886, less than 0.5% lower than the
optimal). (For this example only, we order the num-
bers in n_g spatially along the line segment; we will typ-
ically use n_g to indicate community sizes without spa-
tial reference.) Yet again, subnetwork-restricted par-
titioning provides an improvement, yielding an n_g =
{2,2,3,3,3,3,2,2} partition from which KL iterations
converge either to the optimal {5,5,5,5} partition or
the nearly optimal {4,4,4,4,4} partition (depending on
a random tie-breaking step). Another way out of this
situation would be to generalize the implementation to 
allow forking and/or backtracking along the selected par- 
titioning steps. However, the potential downside is that 
this would allow so many choices that such an algorithm 
would become untenable on networks of any reasonable 
size. 
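
The modularity values quoted for the 20-node bucket brigade can be reproduced directly from the modularity matrix. In this sketch the helper names (line_graph, modularity_from_sizes) are hypothetical, and the group sizes are interpreted as consecutive runs of nodes along the line segment, as in the spatial ordering of n_g used above.

```python
import numpy as np

def line_graph(n):
    """Adjacency matrix of n nodes joined to their nearest neighbors on a line."""
    A = np.zeros((n, n))
    idx = np.arange(n - 1)
    A[idx, idx + 1] = 1.0
    A[idx + 1, idx] = 1.0
    return A

def modularity_from_sizes(n, sizes):
    """Modularity of the line graph cut into consecutive groups of given sizes."""
    if sum(sizes) != n:
        raise ValueError("group sizes must sum to n")
    A = line_graph(n)
    k = A.sum(axis=1)
    B = A - np.outer(k, k) / k.sum()          # modularity matrix
    labels = np.repeat(np.arange(len(sizes)), sizes)
    same = labels[:, None] == labels[None, :]
    return B[same].sum() / k.sum()

for sizes in ([10, 10], [4, 3, 6, 3, 4], [4, 4, 4, 4, 4], [5, 5, 5, 5]):
    print(sizes, round(modularity_from_sizes(20, sizes), 4))
```

The four printed values agree with those quoted in the text: 0.4474, 0.5720, 0.5886, and 0.5914, respectively.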



B. United States Congress Roll Call Votes 

To provide an interesting real-world example, we in-
fer networks from Congressional roll call votes (obtained
from Voteview [41]) based on voting similarity between
legislators, which is determined according to the votes
they cast (without using any political information about
the content of the bills on which they voted). Noting
that the definition of voting similarity is by no means
unique, we choose to define the weighted link between
two legislators as the tally of the number of times they
voted in the same manner on a bill (i.e., either both for
it or both against it) divided by the total number of bills
on which they both voted (thereby accounting for absten-
tions and absences) during each two-year Congressional
term. For the purposes of studying voting similarities
as a network between legislators, we ignore the perfect
similarity of a legislator with herself, setting A_ii = 0.
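
The similarity definition above can be written compactly in matrix form. The following is a minimal sketch, assuming a hypothetical encoding of the roll-call data as a legislators-by-bills array with +1 for a vote in favor, -1 for a vote against, and 0 for an abstention or absence; the function name and the choice to assign zero weight to pairs with no bills in common are likewise assumptions of this sketch.

```python
import numpy as np

def voting_similarity(votes):
    """Agreement fraction between every pair of legislators.

    votes: array of shape (legislators, bills) with entries +1 (for),
    -1 (against), 0 (did not vote). This encoding is an assumption.
    """
    votes = np.asarray(votes, dtype=float)
    cast = (votes != 0).astype(float)
    both_voted = cast @ cast.T                 # bills on which both voted
    agree_minus_disagree = votes @ votes.T     # agreements minus disagreements
    agreements = (both_voted + agree_minus_disagree) / 2.0
    A = np.divide(agreements, both_voted,
                  out=np.zeros_like(agreements), where=both_voted > 0)
    np.fill_diagonal(A, 0.0)                   # ignore self-similarity, A_ii = 0
    return A
```

For example, two legislators who both voted on three bills and agreed on two of them receive a link of weight 2/3, while a pair with no bills in common receives weight zero in this sketch.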

The purpose of this discussion is not an exhaustive po- 
litical analysis (which we present in Ref. [44]). Instead, 
we aim only to show examples in which the spectral par- 
titioning methods we have proposed are particularly ad- 
vantageous. We therefore ignore for the time being a 
number of important issues about the construction of 
the voting-similarity networks and the selection of the 
null model. For instance, given the dense nature of the 
inferred similarity network and the relative uniformity in 
the total edge strength distributions of the nodes, one 
might reasonably consider a uniform null model [32] in-
stead of Newman-Girvan modularity, and it might also be
insightful to explore resolution parameters @, [32j using 
either null model. Moreover, the entire similarity con- 
struction might be reasonably replaced by signed net- 
works that account separately for the agreements and 
disagreements between legislators (using an appropriate 
null model such as that discussed in Ref. [50], which is
also compatible with the partitioning methods proposed
in the present manuscript).

Most of the highest modularity partitions identified in 
the (two-year) voting similarity networks of the House 
of Representatives and Senate across Congressional his- 
tory consist of two communities corresponding closely 
but not perfectly to the two major parties of the day. 
This is similar to previous findings on legislative cospon- 
sorship networks, for which all of the Houses in the avail- 
able data attain their highest modularity partition after 
the initial bipartition, corresponding closely to a Demo-
crat/Republican split [16]. Although no direct tripar-
titioning steps were used in Ref. [16], the subnetwork-
restricted partitioning (discussed in Section IV) was fre-
quently able to identify splits in the communities that
appeared to include groups of Southern Democrats and 
Northeastern Republicans who were not as tightly tied to 
their respective parties. Such results are consistent with 
political theories and observations about low-dimensional 
legislative policy spaces [4(J E[ > including the assertion 
that the essentially one-dimensional (Left-Right) legisla- 
tive spectrum typically observed in U. S. history has not 
held as strongly during times when issues related to slav- 
ery and civil rights have been of high legislative impor- 
tance. 

As an example of how this extra dimension in policy 
space can affect the detected communities, we use the 
tools of the present manuscript to explore the roll call 
voting similarity network of the 85th House of Represen- 
tatives (1957-1958), which passed the Civil Rights Act 
of 1957. This network includes all 444 Representatives 
who voted during the period (including midterm replace-
ments). All but 386 of the 98346 pairs of legislators have
non-zero similarity weight in this network, with 251 of 
the empty similarities affiliated with House Speaker Sam 
Rayburn [D-TX], who chose to vote "present" (treated 
as an abstention) in all but one vote, in which he broke 
a tie on an amendment to the Interstate Commerce Act. 

Recursive bipartitioning of the 85th House identifies a 
partition with two communities of sizes n_g = {230, 214}
(with Q = 0.07935), and subsequent KL iterations do
not result in a higher-modularity partition. As expected, 
such communities correspond reasonably but not per- 
fectly with party affiliation; the Republican community 
includes 16 Democrats, and the Democratic community 
includes 6 Republicans. Direct tripartitioning yields three
communities of sizes n_g = {192, 187, 65} and a higher
modularity (Q = 0.08019). Subsequent KL iterations 
yield a {197, 190, 57} partition with Q = 0.08063, which 
is the highest modularity we identified in this network us-
ing the methods of the present manuscript [53]. Whether
or not one allows tripartitions during a single step, the 
same partition is obtained by subnetwork-restricted par- 
titioning followed by KL iterations (though obviously by 
a more convoluted process in this case). The largest of 
these three communities includes 196 Republicans (all 
but 8 of them) and Democratic Speaker Sam Rayburn, 
the "misplacement" of whom arises from his having cast 
only the one aforementioned vote. The next largest com-
munity is dominated by Democrats and includes the re-
maining 8 Republicans. The smallest community consists
entirely of Democrats from Southern states. 

To demonstrate the use of these methods with a qual- 
ity function other than modularity, we also studied voting 
similarities using the quality function with additional self 
loops proposed by Arenas et al. [47]. As emphasized ear-
lier, any quality function that can be expressed with a B
matrix as in (1) is amenable to these spectral partitioning
methods. Adding such self-loops of weight r to the vot-
ing similarity adjacency matrix and null model [54], our
procedure identifies the same three-community structure
in the range -3.4 < r < 4.9 (with r = 0 corresponding to
the usual definition of modularity). Below this range, the
algorithm identifies a two-community partition; above
this range, it identifies a four-community partition. We
note that the use of the quality function with self-loops 
demonstrates a distinct downside in using subnetwork- 
restricted partitioning, as even modest positive values 
of r lead such extended subpartitioning to proceed all 
the way down to single-node groups, thereby utilizing 
significantly greater computing time without leading to 
any improvement in the final communities over what would be ob-
tained by KL iterations starting simply from individual-
node groups. We therefore do not advocate subnetwork-
restricted partitioning for any resolutions significantly
finer than that corresponding to the traditional defini- 
tion of modularity. 
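
A B matrix for the self-loop quality function can be sketched as follows. The construction shown here, adding r to every diagonal entry of the adjacency matrix and building the Newman-Girvan null model from the resulting strengths, is our reading of the proposal of Arenas et al.; the function names are hypothetical, and the specific conventions used for the results above are not reproduced here.

```python
import numpy as np

def self_loop_quality_matrix(A, r):
    """Return (B_r, 2W_r) after adding a self-loop of weight r to every node.

    Assumed construction: A_r = A + r I, strengths k_r = k + r, and
    B_r = A_r - k_r k_r^T / (2W + n r). Requires 2W + n r to be nonzero.
    """
    A = np.asarray(A, dtype=float)
    A_r = A + r * np.eye(A.shape[0])
    k_r = A_r.sum(axis=1)
    two_w_r = k_r.sum()
    return A_r - np.outer(k_r, k_r) / two_w_r, two_w_r

def self_loop_quality(A, labels, r):
    """Quality of a partition: sum of B_r over same-group pairs, over 2W_r."""
    B_r, two_w_r = self_loop_quality_matrix(A, r)
    labels = np.asarray(labels)
    same = labels[:, None] == labels[None, :]
    return B_r[same].sum() / two_w_r
```

Setting r = 0 recovers the usual modularity matrix, so any routine that accepts a B matrix can be reused unchanged for other values of r.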



C. Network Coauthorship 

The graph of coauthorships in network science publica-
tions [28] has become a well-known benchmark example
in the network science community. The largest connected 
component of this network consists of 379 nodes, repre- 
senting authors, with 914 weighted edges indicating the 
coauthored papers between pairs of scientists. 

Without the use of tripartitioning steps, spectral re-
cursive bipartitioning yields a Q = 0.8188 partition with
29 communities of various sizes. KL iterations starting
from this state yield a Q = 0.8409 partition with 27 com-
munities (subject to random tie-breaking instances). In
this example, the subnetwork-restricted modularity max- 
imization does not improve the collection of communities 
that one obtains; rather, it generates a partition of 120 
subcommunities that KL iterations subsequently merge 
into a Q = 0.8190 partition with 37 communities. That 
is, while the modularity obtained in this manner is bet- 
ter than that from the original recursive bipartitioning, 
it is not as good as that obtained when KL iterations 
are used directly after the modularity-increasing bipar- 
titioning terminates. In contrast, using KL iterations 
that allow new groups to be created proceeds without 
ever generating such new groups when started from the 
29-group partition obtained by recursive bipartitioning 
(even with formation of new groups selected in any tie 
breaking). Hence, we note that subnetwork-restricted 
modularity maximization yields better results (i.e., a
higher-modularity partition) in some cases, whereas KL 
iterations that allow the creation of new groups yield bet- 
ter results in others. 

When tripartitioning steps are used along with biparti- 
tioning, one obtains a final partition (after KL iterations) 
with a higher modularity, even though some of the inter- 
mediate states have lower modularity. Taking the best 
available division at each stage, the algorithm first splits 
the network into three groups. Each of these is then fur- 
ther divided three ways, eventually yielding a Q = 0.8032 
partition with 39 communities. Even though this mod- 
ularity is lower than that obtained using spectral bipar- 
titioning by itself, applying KL iterations at this stage 
yields a Q = 0.8427 partition with 24 communities, which 
is slightly better than the best result described above. 
In this case, too, the subnetwork-restricted modularity 
maximization plus KL iterations yields a final partition 
with lower modularity (Q = 0.8220). Such results high- 
light an important point: Using multiple combinations 
of these methods might give better results than fixating 
on any specific combination. 

The initial tripartitioning of the full network is itself 
interesting as an example of our divide-and-conquer ap- 
proach, as it illustrates the extent to which KL itera- 
tions can improve a spectrally-obtained partition. It also 
provides a compelling visualization (see Fig. 4) of the
three-way division of the vertex vectors in the plane. The
partition obtained after the initial split contains groups
of sizes {164,118,97} with Q = 0.5928. Applying KL
iterations to this partition moves 54 of the nodes and
gives a Q = 0.6354 partition (see Fig. 4) with groups
of sizes {136,128,115}. (Note that, in contrast to this
example, we do not typically apply KL iterations after
each recursive subdivision step; instead, we apply them
after exhausting spectral techniques.) Although KL it-
erations moved a significant fraction of the nodes, the
regions in Fig. 4 nevertheless appear to resemble non-
overlapping wedges because the nodes that were moved
are all too close to the origin to visualize the overlap in-
duced by the KL iterative improvement. For convenience,
we have indicated in Fig. 4 the last names of the 10 au-
thors (some of whom are rather familiar) with largest
vertex vector magnitudes. Finally, although this three- 
community partition illustrates some of the well-known 
research camps in network science, it is important to re- 
member that the modularity-maximizing partition of this 
network has many more than three communities. 
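
Two-dimensional vertex vectors of the kind plotted for this network can be computed along the following lines. This is a sketch under stated assumptions: the function name is hypothetical, the quality function is taken to be modularity, and each node's coordinates are taken to be its components in the two leading eigenvectors of B scaled by the square roots of the corresponding eigenvalues (following the vertex-vector construction in Newman's spectral method), which requires those eigenvalues to be positive.

```python
import numpy as np

def vertex_vectors(A, dim=2):
    """Coordinates of each node in the leading eigenvectors of B.

    Row i of the result is the vertex vector of node i. The scaling by
    sqrt(eigenvalue) is an assumption of this sketch.
    """
    A = np.asarray(A, dtype=float)
    k = A.sum(axis=1)
    B = A - np.outer(k, k) / k.sum()
    evals, evecs = np.linalg.eigh(B)           # eigenvalues in ascending order
    top = np.argsort(evals)[::-1][:dim]        # indices of the largest ones
    beta = evals[top]
    if np.any(beta <= 0):
        raise ValueError("fewer than dim positive eigenvalues")
    return evecs[:, top] * np.sqrt(beta)

def largest_magnitude_nodes(V, count=10):
    """Indices of the nodes whose vertex vectors have the largest magnitudes."""
    return np.argsort(np.linalg.norm(V, axis=1))[::-1][:count]
```

Nodes whose vertex vectors lie close to the origin contribute little to the approximate quality, which is consistent with the observation above that moving them leaves the wedge-like appearance of the regions intact.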



VI. CONCLUSIONS 

FIG. 4: (Color online) Two-dimensional vertex vector coor-
dinates of the 379-node largest connected component of the
network scientists coauthorship graph. Colors/shapes denote
the post-KL three-community partition discussed in the text.

We have presented a computationally-efficient method
for spectral tripartitioning of a network using the leading
pair of eigenvectors of a modularity matrix. Our algo-
rithm, which can be applied without modification to a
broad class of quality functions, extends the previously-
available methods for spectral optimization of modular-
ity in Refs. [27], HH. Paired with spectral bipartitioning
in a recursive implementation with subsequent KL iter- 
ations and an optional subnetwork-restricted modularity 
maximization extension, the inclusion of possible tripar- 
titioning steps significantly expands the possible parti- 
tions that can be efficiently considered in the heuristic 
optimization of modularity or any of the wide class of 
quality functions that can be expressed in similar ma- 
trix form (such as those that generalize modularity using 
a multiplicative resolution parameter [32] or self-loops [47]).

Our investigation also provides an important caution- 
ary tale about community detection in networks. Despite 
a wealth of recent research on this subject, it sometimes 
remains unclear how to interpret the results of graph par- 
titioning methods and which methods are most appropri- 
ate for which particular data sets [1, 0, HH, H(| ■ While 
recursive subdivision seems to give some hierarchical in- 
formation about network structure and how nodes are 
grouped (see Section El and Refs. [1 QJ, M, EI O), 
the hierarchies that one obtains might indicate as much 
about the algorithms employed as they do about any true 
hierarchical structures of communities (see the discus- 
sion in Ref. (sTj] ) . Moreover, it is important to stress 
that the process of always taking the best modularity 
at each divisive step can lead to states that are not as 
good as might have been obtained from other choices in 
the forking decision process, and that the best post-KL 
partitions are not always obtained from the best avail- 
able pre-KL states. Because of the necessarily heuristic 
nature of trying to obtain a high quality partition in poly- 
nomial time, it is beneficial to have access to a variety of 
computationally-efficient tools with which to explore the 
complicated landscape of possible community partitions. 